Producing Scores for Customers via Ensembling SVM
Authors
Supervised by Dr. Yilong Yin, School of Computer Science and Technology, Shandong University, Jinan 250100, China
Abstract
This report presents our solution to the PAKDD Competition 2007. Following a brief description of the data mining task, we discuss four difficulties that the task poses. We then describe the data pre-processing. To reduce the class imbalance of the modeling dataset externally, we combine under-sampling and over-sampling techniques. In addition, we adjust the parameters of each support vector machine (SVM) internally to handle cost-sensitivity. Next, we build an ensemble of SVMs to achieve higher accuracy. Finally, we present the essence of the model and offer some suggestions for the consumer finance company.
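The pipeline outlined in the abstract (re-sampling to balance the classes, a per-class cost inside each SVM, and averaging the scores of several SVMs) can be illustrated with a minimal sketch. This is not the authors' implementation: the linear SVM trained with Pegasos-style stochastic gradient descent, the toy data, and every name and parameter value (train_linear_svm, balanced_sample, pos_weight, n_models and so on) are assumptions made purely for illustration.

```python
import random

def train_linear_svm(X, y, pos_weight=1.0, lam=0.01, epochs=20, seed=0):
    """Pegasos-style SGD on a class-weighted hinge loss (labels are +1/-1)."""
    rng = random.Random(seed)
    Xa = [list(x) + [1.0] for x in X]  # constant feature acts as the bias term
    w = [0.0] * len(Xa[0])
    order = list(range(len(Xa)))
    t = 0
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            t += 1
            eta = 1.0 / (lam * t)  # decaying step size
            margin = y[i] * sum(wj * xj for wj, xj in zip(w, Xa[i]))
            w = [(1.0 - 1.0 / t) * wj for wj in w]  # L2 shrinkage, eta * lam = 1/t
            if margin < 1.0:
                # Hypothetical cost setting: positive-class errors cost more.
                cost = pos_weight if y[i] == 1 else 1.0
                w = [wj + eta * cost * y[i] * xj for wj, xj in zip(w, Xa[i])]
    return w

def decision_value(w, x):
    """Signed score of one example under one linear model."""
    return sum(wj * xj for wj, xj in zip(w, list(x) + [1.0]))

def balanced_sample(pos, neg, n, rng):
    """Over-sample the minority with replacement, under-sample the majority."""
    X = rng.choices(pos, k=n) + rng.sample(neg, n)
    y = [1] * n + [-1] * n
    return X, y

def train_ensemble(pos, neg, n_models=5, n=40, pos_weight=2.0, seed=0):
    """Train each SVM on its own balanced re-sample of the data."""
    rng = random.Random(seed)
    models = []
    for m in range(n_models):
        X, y = balanced_sample(pos, neg, n, rng)
        models.append(train_linear_svm(X, y, pos_weight=pos_weight, seed=seed + m))
    return models

def ensemble_score(models, x):
    """Customer score: the mean decision value over all ensemble members."""
    return sum(decision_value(w, x) for w in models) / len(models)

# Toy imbalanced data (20 positives, 200 negatives), made up for illustration.
rng = random.Random(42)
pos = [[2 + rng.uniform(-1, 1), 2 + rng.uniform(-1, 1)] for _ in range(20)]
neg = [[-2 + rng.uniform(-1, 1), -2 + rng.uniform(-1, 1)] for _ in range(200)]

models = train_ensemble(pos, neg)
print(ensemble_score(models, [2.5, 1.5]), ensemble_score(models, [-1.5, -2.5]))
```

Averaging raw decision values yields a continuous score that can rank customers, rather than only a class label. A kernel SVM or a different combination rule could be swapped in without changing the overall structure.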
Similar resources
Predicting Future Customers via Ensembling Gradually Expanded Trees
This report presents our solution to PAKDD’06 Data Mining Competition. Following a brief description on the task, we discuss the difficulties of the task and explain the motivation of our solution. Then, we propose the GetEnsemble (Gradually Expanded Tree Ensemble) method, which handles the difficulties via ensembling expanded trees. We evaluated the proposed method and several other methods us...
FDiBC: A Novel Fraud Detection Method in Bank Club based on Sliding Time and Scores Window
One of the recent strategies for increasing the customer’s loyalty in banking industry is the use of customers’ club system. In this system, customers receive scores on the basis of financial and club activities they are performing, and due to the achieved points, they get credits from the bank. In addition, by the advent of new technologies, fraud is growing in banking domain as well. Therefor...
Mental Distress Detection and Triage in Forum Posts: The LT3 CLPsych 2016 Shared Task System
This paper describes the contribution of LT3 for the CLPsych 2016 Shared Task on automatic triage of mental health forum posts. Our systems use multiclass Support Vector Machines (SVM), cascaded binary SVMs and ensembles with a rich feature set. The best systems obtain macro-averaged F-scores of 40% on the full task and 80% on the green versus alarming distinction. Multiclass SVMs with all feat...
Ensembling Predictions of Student Post-Test Scores for an Intelligent Tutoring System
Over the last few decades, there have been a rich variety of approaches towards modeling student knowledge and skill within interactive learning environments. There have recently been several empirical comparisons as to which types of student models are better at predicting future performance, both within and outside of the interactive learning environment. A recent paper (Baker et al., in pres...
The Sum is Greater than the Parts: Ensembling Student Knowledge Models in ASSISTments
Recent research has had inconsistent results as to the utility of ensembling different approaches towards modeling student knowledge and skill within interactive learning environments. While work in the 2010 KDD Cup data set has shown benefits from ensembling, work in the Genetics Tutor has failed to show benefits. We hypothesize that the key factor has been data set size. We explore the potent...